AITopics | target value

f1507aba9fc82ffa7cc7373c58f8a613-Paper.pdf

Neural Information Processing Systems | Apr-27-2026, 18:39:48 GMT

datapoint, machine learning, natural language, (15 more...)

Neural Information Processing Systems

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

0b5669c3b07bb8429af19a7919376ff5-Supplemental-Conference.pdf

Neural Information Processing Systems | Apr-24-2026, 12:32:43 GMT

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)

Mildly Conservative Q-Learning for Offline Reinforcement Learning

Neural Information Processing Systems | Apr-24-2026, 12:32:39 GMT

Offline reinforcement learning (RL) defines the task of learning from a static logged dataset without continually interacting with the environment. The distribution shift between the learned policy and the behavior policy makes it necessary for the value function to stay conservative so that out-of-distribution (OOD) actions are not severely overestimated. However, existing approaches, which penalize unseen actions or regularize toward the behavior policy, are too pessimistic; this suppresses the generalization of the value function and hinders performance improvement. This paper explores mild but sufficient conservatism for offline learning that does not harm generalization. We propose Mildly Conservative Q-learning (MCQ), where OOD actions are actively trained by assigning them proper pseudo Q-values. We theoretically show that MCQ induces a policy that behaves at least as well as the behavior policy and that no erroneous overestimation will occur for OOD actions. Experimental results on the D4RL benchmarks demonstrate that MCQ achieves remarkable performance compared with prior work. Furthermore, MCQ shows superior generalization ability when transferring from offline to online, and significantly outperforms baselines. Our code is publicly available at https://github.com/dmksjfl/MCQ.
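
To make the idea of training OOD actions toward pseudo Q-values concrete, here is a minimal tabular sketch. It is not the authors' implementation. The abstract does not say how pseudo Q-values are constructed or how the two loss terms are weighted, so both are assumptions here: the pseudo target is taken to be the largest Q-value among actions seen in the dataset at that state, and `lam` is a hypothetical mixing weight. All function and variable names are illustrative.

```python
def pseudo_target(q, state, dataset_actions):
    """ASSUMPTION: pseudo Q-value = max Q over actions seen in the dataset at `state`."""
    return max(q[(state, a)] for a in dataset_actions[state])


def mildly_conservative_loss(q, td_batch, ood_batch, dataset_actions, lam=0.5):
    """Weighted sum of a standard TD regression term and an OOD regression term.

    q               -- dict mapping (state, action) to a Q-value
    td_batch        -- list of (state, action, td_target) from the logged dataset
    ood_batch       -- list of (state, ood_action) pairs not present in the dataset
    dataset_actions -- dict mapping state to the actions logged at that state
    lam             -- hypothetical weight between the two terms
    """
    # In-distribution term: regress Q(s, a) toward the given TD target.
    td_term = sum((q[(s, a)] - y) ** 2 for s, a, y in td_batch) / len(td_batch)
    # OOD term: pull Q(s, a_ood) toward the pseudo target instead of
    # pushing it down without bound.
    ood_term = sum(
        (q[(s, a)] - pseudo_target(q, s, dataset_actions)) ** 2 for s, a in ood_batch
    ) / len(ood_batch)
    return lam * td_term + (1.0 - lam) * ood_term


# Toy example: "jump" never appears in the dataset and is overestimated.
q_table = {("s0", "left"): 1.0, ("s0", "right"): 2.0, ("s0", "jump"): 5.0}
seen = {"s0": ["left", "right"]}
td = [("s0", "left", 1.0), ("s0", "right", 2.0)]
ood = [("s0", "jump")]
loss = mildly_conservative_loss(q_table, td, ood, seen, lam=0.5)
```

In this toy case the TD term is already zero, so the whole loss comes from the overestimated OOD action; lowering `Q("s0", "jump")` to the pseudo target of 2.0 drives the loss to zero without forcing it below the in-dataset values.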

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Sobolev Training for Neural Networks

Neural Information Processing Systems | Mar-17-2026, 15:09:51 GMT

At the heart of deep learning we aim to use neural networks as function approximators - training them to produce outputs from inputs in emulation of a ground truth function or data creation process. In many cases we only have access to input-output pairs from the ground truth; however, it is becoming more common to have access to derivatives of the target output with respect to the input -- for example, when the ground truth function is itself a neural network, as in network compression or distillation. Generally these target derivatives are not computed, or are ignored. This paper introduces Sobolev Training for neural networks, a method for incorporating these target derivatives in addition to the target values while training.
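
As a rough illustration of matching derivatives as well as values, the sketch below builds a loss with both terms for a toy polynomial model, whose derivative is available in closed form. It is only a sketch under stated assumptions, not the paper's implementation: the polynomial model, the squared-error form of both terms, and the `deriv_weight` parameter are all hypothetical choices made for the example.

```python
def poly_value(coeffs, x):
    """Evaluate p(x) = sum_k coeffs[k] * x**k."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


def poly_derivative(coeffs, x):
    """Evaluate p'(x) = sum_{k>=1} k * coeffs[k] * x**(k-1)."""
    return sum(k * c * x ** (k - 1) for k, c in enumerate(coeffs) if k > 0)


def sobolev_style_loss(coeffs, xs, target_values, target_derivs, deriv_weight=1.0):
    """Mean squared error on values plus weighted mean squared error on derivatives.

    deriv_weight is a hypothetical knob; setting it to 0 recovers a
    value-only regression loss.
    """
    n = len(xs)
    value_term = sum(
        (poly_value(coeffs, x) - y) ** 2 for x, y in zip(xs, target_values)
    ) / n
    deriv_term = sum(
        (poly_derivative(coeffs, x) - d) ** 2 for x, d in zip(xs, target_derivs)
    ) / n
    return value_term + deriv_weight * deriv_term


# Ground truth f(x) = x**2 with f'(x) = 2x, sampled at two points.
xs = [-1.0, 1.0]
ys = [x ** 2 for x in xs]       # target values
ds = [2.0 * x for x in xs]      # target derivatives

constant_model = [1.0, 0.0, 0.0]   # p(x) = 1: right values at xs, wrong slopes
exact_model = [0.0, 0.0, 1.0]      # p(x) = x**2

value_only = sobolev_style_loss(constant_model, xs, ys, ds, deriv_weight=0.0)
with_derivs = sobolev_style_loss(constant_model, xs, ys, ds, deriv_weight=1.0)
exact = sobolev_style_loss(exact_model, xs, ys, ds, deriv_weight=1.0)
```

The constant model fits both sampled values exactly, so a value-only loss cannot tell it apart from the true function; the derivative term is what exposes the mismatch.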

artificial intelligence, machine learning, neural network, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Reward-Directed Conditional Diffusion: Provable Distribution Estimation and Reward Improvement

Neural Information Processing Systems | Feb-16-2026, 20:55:35 GMT

We explore the methodology and theory of reward-directed generation via conditional diffusion models.

artificial intelligence, diffusion model, machine learning, (15 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Industry: Health & Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)

Appendices. A: Benchmark Details. The full set of primitive tasks is summarized in Table 1 and the statistics are detailed in Table 2.

Neural Information Processing Systems | Feb-16-2026, 09:02:49 GMT

Figure 9: 3D Shapes. Modify wall color from source value to target value.
Figure 10: 3D Shapes. Modify orientation from source value to target value.
Figure 11: BitMoji Faces. Modify facial skin tone from source value to target value.

artificial intelligence, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Technology: